perm filename P[AP,DBL]3 blob sn#141985 filedate 1975-01-28 generic text, type T, neo UTF8
COMMENT ⊗   VALID 00019 PAGES 
RECORD PAGE   DESCRIPTION
 00001 00001
 00003 00002	.DEVICE XGP
 00005 00003	5↓_ABSTRACT_↓*
 00007 00004	5↓_1. Motivation_↓*
 00010 00005	5↓_2. Task_↓*
 00014 00006	5↓_3. Target program_↓*
 00019 00007	5↓_4. Annotated protocol_↓*
 00023 00008	5↓_5. The BEINGs scheme_↓*
 00045 00009	5↓_6. Control in the system_↓*
 00052 00010	5↓_7. Theoretical aspects of PUP6_↓*
 00060 00011	5↓_8. Ideal and real systems_↓*
 00070 00012	5↓_9. Questions for AP systems_↓*
 00078 00013	5↓_10. Examples from the dialogue which results in CF_↓*
 00091 00014	5↓_11. Excerpt from the synthesized program itself running_↓*
 00096 00015	5↓_12. Other Tasks_↓*
 00102 00016	5↓_13. Numerical efficiency data_↓*
 00106 00017	5↓_14. Conclusions_↓*
 00119 00018	5↓_References_↓*
 00123 00019	5↓_Appendix: the BEINGs_↓*
 00127 ENDMK
⊗;
.DEVICE XGP
.!XGPCOMMANDS←"/TMAR=100/PMAR=2200/BMAR=100"

.FONT 1 "BASL30"
.FONT 2 "BASB30"
.FONT 3 "NGR25"
.FONT 4  "BASI30"
.FONT 5  "NGR40"
.FONT 6  "NGR20"
.FONT 7 "FIX20"
.TURN ON "↓_π{"
.TURN ON "⊗" FOR "%"
.PAGE FRAME 41 HIGH 91 WIDE
.AREA TEXT LINES 1 TO 39
.AREA FOOTING LINE 41
.!XGPLFTMAR←100
.SPACING 100 MILLS
.PREFACE 230 MILLS
.NOFILL
.PAGE FRAME 41 HIGH 91 WIDE
.AREA TEXT LINES 1 TO 39
.AREA FOOTING LINE 41
.PREFACE 45 MILLS
.FILL
.NEXT PAGE
.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
.MACRO BB ⊂ BEGIN NOFILL SKIP 2 SELECT 3 INDENT 0 GROUP ⊃
.MACRO E ⊂ APART END ⊃
.TABBREAK
.PAGE←0
.EVERY FOOTING(⊗7{DATE},Lenat,page  {PAGE}⊗*)
.NEXT PAGE
.BEGIN CENTER
⊗5SYNTHESIS OF LARGE PROGRAMS FROM SPECIFIC DIALOGUES⊗*

Douglas B. Lenat
Stanford Artificial Intelligence Laboratory
Stanford, California

⊗5↓_ABSTRACT_↓⊗*

.END

Automatic programming must eventually deal with large programs. 
Using the paradigm of dialogue, a preliminary study was done on
generating programs tens of pages long, requiring
several hours of user-system interaction time to synthesize. Many
assumptions which are reasonable when concentrating upon tiny
targets become limiting factors in large dialogues. An
experimental system, PUP6, was constructed.
The methods it employs to generate target code are not
formal, but rather involve structuring of knowledge about
programming, about the chosen task domain (inductive inference
LISP programs), and about transfer of control. Specification is
via long, brittle dialogues with the user, but is only partial:
ambiguities are resolved by inference and deferral.
Knowledge is represented as a pool of structured modules, whose
abilities include asking and answering questions posed by each
other and by the user.  Inadvertently, a new style of target
program was synthesized: like PUP6, it can be interrupted as it runs
and queried about what it's doing.
This research revealed some major difficulties
the "next generation" of nonformal automatic
program synthesis systems must face.

.B

.E
.ONCE CENTER
⊗5↓_1. Motivation_↓⊗*

Many AI researchers (e.g., Sussman, Goldstein) have experimented with the
generation of small programs, from partial specifications in an ambiguous 
language. They recognize
that larger capabilities are prerequisite to utility, but (correctly)
argue that mastery of toy tasks must precede mastery of real ones.
Efforts on synthesis of ⊗4large⊗* programs concentrate on
efficient but routine transformations, compilations
of well-specified programs in higher languages
(e.g., by heuristic compilers).  
	Despite the small chance for success in aiming at ambitious targets,
it is dangerous to develop techniques applicable only to small tasks.
The analogy between understanding the
construction of a 10-line program, in five minutes of dialogue, and
understanding the
construction of a 100-page program, in five hours of dialogue, is at
best superficial.
	A system, PUP6, was partially implemented to study large 
code generation tasks. It did
ultimately synthesize three LISP programs: a structural concept formation
program, a grammatical inference program, and a simple property
list maintenance routine (referred to as CF, GI, and PL). Some unanticipated
difficulties were encountered which may be inherent in syntheses of this
magnitude.

.B

.E
.ONCE CENTER
⊗5↓_2. Task_↓⊗*

	The task of synthesizing
large INTERLISP[9] programs, from long specific 
dialogues with a user, in a
restricted subset of English, was considered. This activity will be referred to
as ⊗4automatic programming⊗*. 
	The first question to settle in such an endeavor is what the
⊗4target programs⊗* (the generated code) 
are going to be.  More generally, what is the  domain of the target
programs? A large amount of effort was spent on this question, and the
chosen domain was inductive inference programs. The
obvious problem here is the size and sophistication of the targets, but there
are four big attractions:
	(i)
A  wide  range  of  complexity  exists,  from  a  one-page   sequence
extrapolator   to   Meta-Dendral.
	(ii)   Each   increasingly
sophisticated inductive inference program  uses  
many  of  the  concepts  embodied  in
simpler  inductive inference programs.
	(iii) If a super-human target program is
ever written, it could itself contribute to the  field  of  Automatic
Programming! (This remark is humorous in the seventies, but may be
commonplace someday.)
	(iv)  Since people (especially scientific researchers)
are the inductive
inference experts, our reliance on  introspection  is  as
valid  --  and potentially as valuable -- as chemists' protocols were
to Dendral.
	After  studying  many  sample  programs  from  the inductive inference domain,
sequence extrapolation seemed the most reasonable  beginning
task. It was quickly learned that this was too easy: humans have only
a few techniques for extrapolating  sequences,  and  a  very  limited
capacity   for   composing   them.   Thus  a  rather  rigid  sequence
extrapolator writer may be capable of generating  a  large  class  of
super-human extrapolation programs (see section 4.2 on
Sequence-Extrapolator-Writer, in [3]).
	The  next  candidates  were grammatical inference and concept
formation [4].
Determined not to choose too simple  a  task  again,  the
latter program was finally decided upon.  The particular target was similar to
[10], except Winston's heuristic
graph-matching algorithm was simplified.

.B

.E
.ONCE CENTER
⊗5↓_3. Target program_↓⊗*

	It seems instructive now to describe in detail 
how CF, the  target  program,
should operate. It repeatedly scans a scene and tries to name it. The
scene is broken into a set of features and a set of objects. Each 
feature is  a  relation
on  one  or  more  objects  in  the  scene.  Internally,  CF
maintains a model for each differently-named 
scene it has ever encountered.
The
model  contains  a  description  of  the objects expected in the
scene, a set of features which must be present in any scene having this
name, a set of features which must not
be present in the scene, and a set of features which may or  may  not
be  present.  Thus  a model is an archetypal scene plus a name.
Each time it is confronted by a new scene, CF  must
scan its models until it finds one which matches it. A model is said
to match a scene if all the MUST features associated with that model
are observed in the scene, and all the MUSTNOT  features  are
absent  from  the   scene. CF
informs the user of this guess,
and accepts the proper  name. If it  guessed  incorrectly,  it
modifies its models. The wrong-guess model may have features added to
its MUST or MUSTNOT sets.
This prevents CF from making the same wrong guess twice in succession.
The correct-name model may have  to
be  modified or (if it's a new name) created and inserted into the
list of models. 

.B

For example, part of a scene might be:     
	OBJECTS a,b,c,d
	RELATIONS  (GREEN a) (BLUE c) (TOUCHES c d) (SUPPORTS a c) (SUPPORTS b c)

CF's current model for an arch might be:      	   
	NAME    ARCH
	OBJECTS a,b,c
	MUST 	(SUPPORTS a c) (SUPPORTS b c)
	MUSTNOT (TOUCHES a b)
	MAY	(GREEN a) (WEDGE c) (PRISM a) (BLOCK b) (PARALLEL a b)
.E
	Suppose  that the target program reads in the above
scene fragment and
tries to match it to  the  above  model  for  consistency.  The  MUST
relations  should  all  be  present.   Yes,  the  scene  does contain
(SUPPORTS a c) and (SUPPORTS b c). Next, the MUSTNOT  relations  must
be absent from the scene. Sure enough, (TOUCHES a b) isn't there.  So
the model and scene are consistent, and the program announces that its
guess is ARCH.  
If the user verifies this guess,
then  the MAY set of the ARCH model
is augmented with the relations (BLUE c) and (TOUCHES c d), and
the OBJECTS set is augmented with "d."
	If the user denies that the scene is an arch, CF
sees if there are any relations in the ARCH model's MAY set which do not
occur in the scene. If so, one of them (e.g., (PARALLEL a b)) will
be transferred from the MAY to the MUST set.  If no such feature 
existed, the program would look for a feature present in the scene
but not mentioned in any set of the ARCH model (e.g., (TOUCHES c d)), and insert
it into the MUSTNOT set.  In either case, the user would
be asked what the true name was, and that
model would have its MAY set augmented by any new 
features in the scene. Any features on  the true-name model's
MUST or MUSTNOT sets
which contradicted the scene would be transferred to the MAY set.
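The match-and-update cycle just described can be pictured with a short sketch. This is hypothetical modern Python, not the INTERLISP that PUP6 actually emitted; the dictionary layout and function names are invented for illustration.

```python
# Hypothetical sketch of CF's matching and model-update rules.
# Features are tuples like ("SUPPORTS", "a", "c").

def matches(model, scene):
    """A model matches a scene when every MUST feature appears in the
    scene and every MUSTNOT feature is absent from it."""
    return (all(f in scene["relations"] for f in model["must"]) and
            all(f not in scene["relations"] for f in model["mustnot"]))

def confirm(model, scene):
    """The user verified the guess: fold any novel relations into the
    MAY set, and any novel objects into the OBJECTS set."""
    known = model["must"] + model["mustnot"] + model["may"]
    model["may"] += [f for f in scene["relations"] if f not in known]
    model["objects"] += [o for o in scene["objects"]
                         if o not in model["objects"]]

def deny(model, scene):
    """The user rejected the guess: promote a MAY feature absent from
    the scene to MUST, or, failing that, insert a scene feature the
    model never mentioned into MUSTNOT.  Either way, the same wrong
    guess cannot recur on this scene."""
    absent_mays = [f for f in model["may"] if f not in scene["relations"]]
    if absent_mays:
        model["may"].remove(absent_mays[0])
        model["must"].append(absent_mays[0])
    else:
        known = model["must"] + model["mustnot"] + model["may"]
        novel = [f for f in scene["relations"] if f not in known]
        if novel:
            model["mustnot"].append(novel[0])

# The arch example from the text:
scene = {"objects": ["a", "b", "c", "d"],
         "relations": [("GREEN", "a"), ("BLUE", "c"), ("TOUCHES", "c", "d"),
                       ("SUPPORTS", "a", "c"), ("SUPPORTS", "b", "c")]}
arch = {"name": "ARCH", "objects": ["a", "b", "c"],
        "must": [("SUPPORTS", "a", "c"), ("SUPPORTS", "b", "c")],
        "mustnot": [("TOUCHES", "a", "b")],
        "may": [("GREEN", "a"), ("WEDGE", "c"), ("PRISM", "a"),
                ("BLOCK", "b"), ("PARALLEL", "a", "b")]}
```

Here matches(arch, scene) holds, so the guess is ARCH; confirm then augments MAY with (BLUE c) and (TOUCHES c d) and OBJECTS with "d", while deny would instead promote one MAY feature absent from the scene to MUST.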

.B

.E
.ONCE CENTER
⊗5↓_4. Annotated protocol_↓⊗*

	After the target concept formation program was specified,  it
was trimmed and then rewritten as a structured program [2]. Next,
a complete dialogue was  simulated  between  the  user  and  a  human
programmer  (referred to as the system-player) playing the role of an
"intelligent"  automatic  programming  system  (similar   to,   e.g.,
[1]).      The   system-player   kept   careful  records  as  he
programmed, and tried to create a bug-free  structured  program.  The
dialogue  was  then annotated:     after each user response, comments
were inserted which described the "states" the system-player had gone
through before printing his next response.  This included blind paths
which were tried, use of outside world knowledge,  and,  in  general,
was  meant  to  be  the "reasoning" necessary to do the task.  The
fear was that a system could be built which synthesized CF,
yet  was  so unintelligent that nothing was
learned from it (e.g., see section 4.1 on PW1  in  [3]).
Henceforth, ⊗4protocol⊗* will
refer to this user-player / system-player simulated dialogue.
	The central goals in writing PUP6 were that it
(i) correctly generate the target program CF,
(ii) follow the
protocol, and, most importantly, (iii) go
through  the  same  line of reasoning that the comments indicated the
system-player 
followed. 
PUP6 would be
far along the road toward utility if it also  (iv)  could
write CF  from several dialogues, from several users, with
little preparation. PUP6 was not designed to do this, and in the  end
it  proved to be a crippling deficit.
	A corollary of this incremental annotated protocol approach is that the
abilities of the user must coincide with those  of  the  subject  who
participated  in the protocol.  To be successful,
the user must be familiar
with  LISP,  well-grounded  in  computer  science,   and   have   the
input/output  behavior  of CF, the concept  formation  target program, very
clearly in his mind. The natural language front  end  is  so  brittle
that  the user must also phrase his responses very similarly to those
of the original protocol user-player.


.B

.E
.ONCE CENTER
⊗5↓_5. The BEINGs scheme_↓⊗*

	The next major design issue is the internal mechanism for
achieving this task.  The earlier PUP programs [3] had taught the
author to organize the system knowledge in ways meaningful to how it
would be needed. This need for structuring is also echoed in [7].
Many people, however, champion the merits of uniformity:  in particular,
conceptually easy addition of new knowledge, and more
standardized (hence simpler) communication and interaction between
the knowledge.  Not wishing to give up these advantages, a compromise
mechanism, BEINGs, was developed.
	A BEING is a loosely-knit little expert, a collection of about
thirty different questions and their answers.  Each such part of a BEING, called
an ⊗4organ⊗*, is
therefore an explanation of some facet of the BEING.  
In other words, a specialist is
considered ⊗4equivalent⊗* to its answers to a couple dozen queries. In
answering a question, all a BEING may do is to create a new BEING, fill in
some BEING's answer to some question, and ask questions (of itself, of other
BEINGs, of the human user interacting with  the system).

So far, this representation is isomorphic to the ACTOR[5] formalism.
The structure is present, all right, but what about the uniformity? It
seems that one can't add a new BEING (write its organs) unless he knows
the particular questions he can ask each other BEING in the system (the names
of their organs), and
that one BEING can't communicate with another unless it specifically
knows what that BEING can be asked.  
 	Both problems theoretically disappear if
the set of questions is ⊗4fixed⊗* once and for all.  That is,
let every BEING consist of answers to a single set of questions (have
the same set of organs), and
let the user and every BEING have access to (and implicit understanding of)
the names of these questions. 
	Of course the number of questions (organs) is arbitrary, but it
should not be ignored completely.  By choosing it to be ~1,
a completely uniform representation is obtained.
By allowing it to be ~1000, one could simply take the
union of all the questions any BEING wanted, and thus all the uniformity
would disappear.  The number of organs each BEING has is therefore a parameter
expressing the degree of compromise between structure and standardization
in the community of BEINGs.
	The specific BEING organs chosen have much to do with the
epistemology of programming.  The tentative set of organs implemented in
each BEING in PUP6 are listed below, followed by a brief
description. Also listed is the percentage of the original PUP6 pool 
which actually wanted the organ specified.

.BEGIN NOFILL
.FLUSH LEFT

⊗2IDEN⊗* 54%  How is this BEING referenced in English phrases? Implemented as productions.
⊗2ARGS⊗* 63%  How many arguments? Which are optional? Are there any local variables?
⊗2ARG-CHECK⊗* 81%  Predicates which examine each argument for suitability.
⊗2EVAL-ARGS⊗*  4%  Which arguments are to be evaluated, and which quoted?
⊗2WHAT⊗* 82%  A brief summary of the global purpose. Usually a template for an English sentence.
⊗2WHY⊗* 77%  A justification of the BEING's existence. The caller explains here why it was called.
⊗2HOW⊗* 72%  A summary of the methods the BEING intends to employ. Global strategies.
⊗2EFFECTS⊗* 27%  What will probably be true after this BEING is through? What can it achieve?
⊗2WHEN⊗* 19%  Why should this BEING be given control now? Computed as the sum of weighted factors.
⊗2META-CODE⊗* 70%  The body of the executable code, but with uninstantiated subparts.
⊗2COMMENTS⊗* 16%  Hints for filling in undefined sections of other BEING organs.
⊗2REQUISITES⊗* 10%  If this BEING is actually chosen, what must be made true just before (pre-),
      during (co-), and after (post-) execution?  Work to make these true (unlike ARG-CHECK).
⊗2DEMONS⊗* 7%  Which demons should be kept active while the BEING is in control?
⊗2AFFECTS⊗* 14%  Which other BEINGs might be called by this BEING, and why?
⊗2COMPLEXITY⊗* 92%  A vector of utility measures, including ease of calling, chance of success, chance
      of (calling)* itself, expected time cost, efficiency of its output results.
⊗2SPECIALIZATIONS⊗* 40%  How to write a more streamlined, special-case version of this BEING.
⊗2GENERALIZATIONS⊗* 27%  What are some other BEINGs, encompassing this one?
⊗2ALTERNATIVES⊗* 16%  What are some equivalent BEINGs? (to try if this one fails)
⊗2RESULTS⊗* 15%  How many values does this return? What domain is each in? Any side effects?
⊗2STRUCTURE⊗* 4% If this is ever viewed as a data structure, what operations are done to it, and how?
⊗2ENCODABLE⊗* 9%  How to order the evaluation of the other organs, when writing a new version.
⊗2FORM-CHANGING⊗* 1% Should the entire structure of an existing BEING be radically altered?
⊗2INHIBIT-DEMONS⊗*  5%  A lock/unlock mechanism, useful when handling demonic interrupts.
.END

	The operational details of each organ will be explained by dissecting
OBTAIN-USABLE-INFORMATION,
a high-level, domain-independent BEING. For each organ is listed its 
⊗2NAME⊗*, ⊗3a typical
question which it would answer⊗*, ⊗2its answer⊗*, 
⊗3the LISP program stored as the value
of that organ in OBTAIN-USABLE-INFORMATION⊗* (the way it computed the answer), and
a brief discussion.

.BB
⊗2IDEN⊗*  	 Can you recognize this: "Find out more about frobs"?  		⊗2Yes. Call me with argument "frobs".⊗*
   ((if you see: (OBTAIN-USABLE-INFORMATION any1)
		 then return: (OBTAIN-USABLE-INFORMATION (TRANSLATE any1)))
    (if you see:(FIND OUT MORE ABOUT any1)
		 then return: (OBTAIN-USABLE-INFORMATION any1)))
.E
Each BEING is given the duty to recognize and process English phrases
related to him.  In practice, only a few simple phrase templates were
needed for each BEING; OBTAIN-USABLE-INFORMATION only used two patterns.
All the IDEN organs are collected together into one table.  When a form
⊗4F⊗* must be translated, the TRANSLATE BEING takes control and (by
searching this table) asks "who can recognize ⊗4F⊗*?"  Each IDEN organ
runs a little program (typically a pattern-match), then replies either
NO, or else provides a little program whose value should be the translation
of ⊗4F⊗*.  If there is more than one responder, a global choice BEING,
CHOOSE-FROM, selects the winner.  Often, as in the first case for this
BEING, the program which is
returned calls TRANSLATE recursively on some organs
of ⊗4F⊗*.
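The table-driven recognition can be sketched as a tiny pattern matcher. The Python below is an invented illustration, not PUP6's actual INTERLISP productions; only the second, English-phrase pattern is modeled, and the CHOOSE-FROM arbitration among multiple responders is elided.

```python
# Sketch of IDEN-style phrase templates.  A template word beginning
# with "any" is a variable that binds the rest of the input phrase.

def match(template, phrase):
    """Return variable bindings if phrase fits template, else None."""
    bindings = {}
    for i, word in enumerate(template):
        if word.startswith("any"):          # variable: grab the remainder
            bindings[word] = phrase[i:]
            return bindings
        if i >= len(phrase) or phrase[i] != word:
            return None
    return bindings if len(phrase) == len(template) else None

# One IDEN pattern of OBTAIN-USABLE-INFORMATION, as quoted above:
IDEN_TABLE = [
    (["FIND", "OUT", "MORE", "ABOUT", "any1"],
     lambda b: ("OBTAIN-USABLE-INFORMATION", b["any1"])),
]

def translate(phrase):
    """Ask each IDEN entry "can you recognize this?"; a responder
    returns a little form whose value is the translation."""
    for template, build in IDEN_TABLE:
        b = match(template, phrase)
        if b is not None:
            return build(b)
    return None
```

So "Find out more about frobs" becomes a call to OBTAIN-USABLE-INFORMATION with argument "frobs", exactly as the organ's answer above promises.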

.BB
⊗2EXPLICIT-ARGS⊗*	 What arguments do you take?  		⊗2The single argument U⊗*. 	(U)
.E
The argument to this BEING is never operated upon, hence there is no need to
understand anything about it.  It is simply passed to BEINGs called by
OBTAIN-USABLE-INFORMATION.

.BB
⊗2WHAT⊗*        Basically, what do you do?     ⊗2I obtain some information which can be used.⊗*
⊗2HOW⊗*		 How do you do this? 		⊗2Either obtain new facts about old information, or 
								else obtain totally new information.⊗*
⊗2WHY⊗*		 Why do you do this? 		⊗2PUP6 has no more information that is immediately usable⊗*.
.E
These three organs are primarily for the user's benefit. In the current
case, they don't even have any uninstantiated organs.  If the user asks one
of these questions when this BEING is in control, then the appropriate
answer can simply be typed out.

.BB
⊗2WHEN⊗*	 How badly do you want control now, and why? 	  ⊗2Overall rating is high because
							        	there are 5 pieces of new information.⊗*
((if T then add in -10 because (THIS IS AN EXPONENTIALLY-GROWING, BAD THING TO DO IN GENERAL))
 (if NEW-INFO-LIST then add in  (PLUS  100  (LENGTH  NEW-INFO-LIST))
             	because (WE SHOULD WORK ON UNASSIMILATED NEW  INFORMATION IF THERE IS ANY)))
.E
The WHEN organ of a BEING is a collection of triples: if <predicate> then
<value> because <reason>.  If the <predicate> evaluates to non-null, then
the <value> program is executed. It returns a number, which is then added
together with the other factors' numbers to produce a rough estimate of
how apropos this BEING is to take control right now.  The <reason>
evaluates to an English phrase, for the benefit of inquisitive users.
This linear scheme is undesirable but (sigh) adequate. The first factor
here says to always add in the number -10; the second says
if there is some new information sitting around unexploited, to add in 100
plus the number of such pieces.
These factors and their weights, like the contents of all the organs
of all the BEINGs initially in PUP6, were decided upon and inserted by
hand.
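The factor scheme reads naturally as a list of (predicate, value, reason) triples; the following is a hypothetical Python rendering of the two factors above (names and the world layout are invented).

```python
# Sketch of a WHEN organ: (predicate, value, reason) triples.  The
# values of all triples whose predicate fires are summed into a rough
# estimate of how apropos the BEING is right now; the reasons are
# retained for inquisitive users.

def when_rating(triples, world):
    total, reasons = 0, []
    for pred, value, reason in triples:
        if pred(world):
            total += value(world)
            reasons.append(reason)
    return total, reasons

# The two factors quoted for OBTAIN-USABLE-INFORMATION:
WHEN = [
    (lambda w: True,
     lambda w: -10,
     "THIS IS AN EXPONENTIALLY-GROWING, BAD THING TO DO IN GENERAL"),
    (lambda w: bool(w["new-info"]),
     lambda w: 100 + len(w["new-info"]),
     "WE SHOULD WORK ON UNASSIMILATED NEW INFORMATION IF THERE IS ANY"),
]
```

With five pieces of unassimilated information the rating is -10 + 105 = 95, i.e. "high because there are 5 pieces of new information."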

.BB
⊗2META-CODE⊗*	   What do you do specifically to achieve your results? 	⊗2I have four specific BEINGs
									 decide among themselves who goes next.⊗*
(DO	 (CHOOSE-FROM 		(GET-NEW-INFORMATION U)
				       (TRANSLATE U)
				       (ANALYZE-IMPLICATIONS U)
		         	       (EXTRACT-RELEVANT-SUBSET U))
    BECAUSE	 (WE CAN ONLY TRY TO OBTAIN USABLE INFORMATION IN ONE WAY AT A TIME))
.E
The   META-CODE  says  to  pass control to one of these  four BEINGs:
Get-Brand-New-Information  (in a minuscule subset of 
English,  from  the  user), Translate
something    (transform     from     restricted English     to     BEING-calls),
Analyze-The-Implications  (of some existing, translated information),
Extract-a-Relevant-Subset   (of   the   existing   information)    to
concentrate upon.
The goal mechanism, SATISFY,
is usually employed when the BEING doesn't know exactly
which other BEINGs to call. If the ⊗4set⊗* of possible choices is known, as
in this case, then CHOOSE-FROM may be called with the explicit choice set.

.BB
⊗2MAIN-EFFECTS⊗*	 Are you relevant to making the user aware of the time? 	⊗2No.⊗*
 ((to get (NEW INFO any1)      do (OBTAIN-USABLE-INFORMATION any1))
  (to get (USABLE INFO any1)  do (OBTAIN-USABLE-INFORMATION any1)))
.E
	The EFFECTS organs of each BEING are collected  into
one  table  to  facilitate  asking all BEINGs simultaneously "Can you
cause effect X to occur?" Each EFFECTS organ examines X and  the  world,  and
either  replies No, or else returns a little program (involving calls
and constants) which  should  produce  the  effect.
This program generally will involve a call to the BEING
itself. OBTAIN-USABLE-INFORMATION shows
that it should be called to achieve
the existence of new or usable information.

.BB
⊗2AFFECTS⊗*	 	Will you call TRANSLATE? 		⊗2Possibly.⊗*
       ((CHOOSE-FROM IS CALLED)
	(some BEING who can cause (AWARE USER (ABOUT TO OBTAIN USABLE INFO)) is called)
        (TRANSLATE POSSIBLY IS CALLED)
        (GET-NEW-INFORMATION POSSIBLY IS CALLED)
        (ANALYZE-IMPLICATIONS POSSIBLY IS CALLED)
        (EXTRACT-RELEVANT-SUBSET POSSIBLY IS CALLED) )


⊗2COMPLEXITY-VECTOR⊗*	 How difficult are you to call? 	⊗2Average.⊗*  	   (.5 .5 .5 .1) 
.E
The first component says that OBTAIN-USABLE-INFORMATION is of average
difficulty to call. Next, there exists a .5 chance that some descendant
will call it again. The third component indicates that the cost of trying
this BEING is typical. Finally, there is no good reason for inhibiting
it ever.  In general, each component can be a ⊗4program⊗*,
not just a constant.

.BB
⊗2GENERALIZATIONS⊗*	 What BEINGs are more general than you are? 	⊗2WRITE-PROGRAM
					and SERVE-THE-USER⊗*.   	(WRITE-PROGRAM SERVE-THE-USER)
⊗2ALTERNATIVES⊗*	 In case you fail, who else might I try? 	⊗2Try USE-INFORMATION,
					FIX-INCORRECT-PIECE, OPTIMIZE, or FILL-IN-UNDEFINED-SECTION)⊗*
.E
If this BEING fails, try the alternatives; if they fail, consider trying the
more general BEINGs.

To illustrate a few of the other BEING organs, consider PARTITION-A-DOMAIN,
a high-level,
domain-specific BEING.
Its
basic idea is to divide up a space into  categories.  Only  two
BEING   organs   are   filled   in   here   which   were  absent  from
OBTAIN-USABLE-INFORMATION:  SPECIALIZATIONS  and  DEMONS.  
	The
SPECIALIZATIONS organ says that to write a specific version of itself,
a few decisions must be made. The results  will  then  indicate  what
pieces  of  code  get  assembled together to form the new BEING.  The
partition may be only partial (an element of the domain belongs to no
class  of  the partition), only weak (an element belongs to more than
one class), and, most importantly, the specialized partitioning
BEING should  work
by  repeatedly doing some of the following three activities: accept a
class-name from the user and guess some of its elements, accept an 
element from the user and try to guess which class it belongs to,
or accept an <element, class-name> pair. Other BEINGs know
about these accepting, guessing, and checking activities.
	The  DEMONS  organ says that during the partitioning, the only
new demon to keep active is the one  called  Fringe-of-Consciousness.
This deserves elaboration. In  the  course  of
writing  GI,  the  system  learns  that  the  main  task  is  one  of
grammatical inference.  The GRAMMAR BEING then  asserts
that  if  the  user ever mentions the word ⊗4test⊗*, he may actually mean
⊗4parse⊗*.   This is placed in the IDEN organ of the TEST  BEING,  and  is
therefore   only  demonized,  waiting  on  the  periphery  of  PUP6's
concentration.   It is in fact triggered later in the dialogue, as an
inference surprising to the user.


.B

.E
.ONCE CENTER
⊗5↓_6. Control in the system_↓⊗*

	The BEINGs control themselves in a simple way. A very general
BEING, SERVE-THE-USER, has control initially. In general, some  BEING
⊗4B⊗*  will  be  in  control.   Its  organs are assembled into an
executable function, and it is run.  During the course of its  reign,
⊗4B⊗*  will  want things done and/or tested which it cannot manage by
itself.  This corresponds to when a normal program needs  to  call  a
subroutine.  What ⊗4B⊗* does is to provide SATISFY, the goal mechanism,
with a description of what is wanted.  SATISFY asks  the
entire  BEING  pool  "who can do this thing?", and collects a list of
the responders.  CHOOSE-FROM is then called, and asks each responder
why he
should be allowed to go now (the WHEN organ answers this) and how costly he will be
if he does go (the COMPLEXITY organ). The responders are then
passed control, in order, until one succeeds or all have failed.
In addition to these goal  statements,  there  exists  a
"world" consisting of assertions and variables with values. These are
regarded as trivial cases of BEINGs, possessing only  one organ.   So
⊗4B⊗*  may also query this data base by saying "what assertions match
this one...", or by asking "what is the value of...".  These  actions
can be programmed in a much more efficient manner than as BEINGs.
They never employ nondeterministic transfers, hence are faster by
an amount proportional to the total number of BEINGs in the system.
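The broadcast-and-choose cycle can be caricatured in a few lines. This Python sketch is illustrative only: the class layout, scoring rule, and names are invented, and the real CHOOSE-FROM weighed the WHEN and COMPLEXITY organs in a more elaborate way.

```python
# Sketch of the SATISFY / CHOOSE-FROM control cycle over a pool of BEINGs.

class Being:
    def __init__(self, name, effects, rating, cost, succeeds):
        self.name, self.effects = name, effects
        self._rating, self._cost, self._ok = rating, cost, succeeds
    def can_cause(self, goal):          # EFFECTS organ: "can you cause X?"
        return goal in self.effects
    def when(self, world):              # WHEN organ: desire for control now
        return self._rating
    def cost(self):                     # COMPLEXITY organ: expense of trying
        return self._cost
    def run(self, goal, world):         # execute; NIL-like False on failure
        return self._ok

def satisfy(goal, pool, world):
    """Broadcast "who can do this thing?", rank the responders by
    desire minus cost, and pass control in order until one succeeds
    or all have failed."""
    responders = [b for b in pool if b.can_cause(goal)]
    responders.sort(key=lambda b: b.when(world) - b.cost(), reverse=True)
    for being in responders:
        if being.run(goal, world):
            return being
    return None
```

Note that a highly rated responder may still fail at run time, in which case control simply passes down the ranked list, mirroring the "until one succeeds or all have failed" rule above.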

The way a BEING's organs fit together
is uniform over all the BEINGs at all times. Thus one simple function
(an  assembled BEING) assembles all the BEINGs initially into LISP
functions.  The precise algorithm for doing this is now discussed.
First, the explicit arguments (those global to the
BEING)  are  bound. The implicit arguments (those local to the BEING,
like PROG variables) are initialized. The name of the BEING is pushed
onto  the  BEING  control  stack, and any
newly-activated  demons  are  pushed  onto  the  demon  stack.   The
ARGS-CHECK  predicates  are  evaluated.  Then PUP6 works to make each
PRE-REQUISITE true in the world.  Each COMMENT is evaluated, then the
CO-REQUISITES,  META-CODE,  and  current  demons  all are executed in
pseudo-parallel.  Each POST-REQUISITE is then satisfied.  These
activities are all embedded in an AND, so any of them can cause the
entire BEING to halt and fail, simply by returning NIL.
Just  before  exiting,  the  demon and BEING
stacks are  popped,
and control passes to the delegated BEING.
A fancy context
mechanism was available but not actually needed.
	The function which assembled all the BEINGs exploited the
fact that it operated only at  system load time. Thus if the BEING 
had no new DEMONs to activate, all the corresponding demon-stack
manipulations could be omitted. 
Similarly, no CO-REQUISITES means no parallel processing need be
simulated. Optimizations like these are clear
from comparing the organs of OBTAIN-USABLE-INFORMATION, the algorithm
just described, and the actual functional version produced:

.BEGIN NOFILL SELECT 3 GROUP

(OBTAIN-USABLE-INFORMATION
  (LAMBDA (U)
     (AND 
         (PUSH     OBTAIN-USABLE-INFORMATION      BEING-STACK)
         (PUT OBTAIN-USABLE-INFORMATION SPEC-WHY BECAUSE)
         (EVERY    CURRENT-DEMONS     APPLY*)
         (SETQQ    BECAUSE    (WE CAN ONLY TRY TO OBTAIN USABLE INFORMATION IN ONE WAY AT A TIME))
         (CHOOSE-FROM 
		 (GET-NEW-INFORMATION U)
	         (TRANSLATE U)
		 (ANALYZE-IMPLICATIONS U)
		 (EXTRACT-RELEVANT-SUBSET U)))
      (POP     BEING-STACK)))
.END
	Each BEING has the responsibility to set the value of the variable
BECAUSE just prior to calling another BEING. When a BEING takes control, it
immediately stores the current value of BECAUSE as a special addition to its
WHY organ. Thus each BEING can provide two reasons: one supplied in general,
and one set by the particular BEING which calls it. Above,
OBTAIN-USABLE-INFORMATION sets a reason for calling CHOOSE-FROM.
Notice that many organs can answer queries, but play no role once the BEING gains
control.

.B

.E
.ONCE CENTER
⊗5↓_7. Theoretical aspects of PUP6_↓⊗*

	For aesthetic reasons, an effort was made to keep the languages for
PUP6 and CF the same. This meant PUP6 must write CF as a family of BEINGs.
Consequently ⊗4some⊗* BEING(s)
must write new BEINGs. The significant design issue here is
that the BEING  who  knows  about
⊗4X⊗*  takes  charge  of  generating  new BEINGs  relating to ⊗4X⊗*.  For
example,  the  INSERT  BEING  can  do  inserting  by  one  of  a  few
algorithms,  and  he  can  also  write (by specializing himself) more
specific  insert  routines,  and  he  can  answer   questions   about
inserting.

This idea is analogous to any reliance upon experts. The
SPECIALIZATIONS and ENCODABLE organs handle this chore.
An unusual consequence is that the synthesized code  will  have  all
the ⊗4types⊗* of abilities that any BEING community has:      it
can modify itself according to the user's
complaints  and  answer questions about what it is doing as it runs.
The CF program generated cannot, of course, write the full complement  of
programs PUP6 could write.
	In a similar vein, recall that ⊗4each⊗* BEING must do the  translation
of  the  users' relevant quasi-English inputs into BEING-usable form. 
For example, our  INSERT  BEING  must  also  recognize  and
process phrases involving inserting, stack-pushing, and merging.
	PUP6 does not operate within the
exploratory anthropomorphic paradigm of programming 
(ignore details,  catch  them
later by debugging).  As in all programming, 
decisions continually crop up which
the system is not able to resolve  at  the  time. PUP6 always tries to
defer the decision as
long as possible.  When, at last, no progress can be made without its
resolution,  and  if the system is still unsure, then either the user
settles the question or else a backtrack point is reluctantly set up.
But  often,  by  this  time,  some  new  information is present which
enables the system to resolve the decision, thus reducing the  amount
of  backtracking.   If  there  are two or more decisions which can no
longer be deferred, the system tackles the easiest one first.
	It was hoped that most of the  carelessness
bugs could be eliminated through this deferral, feed-forward, and very
precise  record-keeping.  Humans  depend  on  their  adaptability  to
compensate  for limitations in their brain hardware
[8] but there is no
need for an ⊗4automatic⊗* programming system to do so.  For  example,
when  a  list  structure  is  first  encountered,  the system records
warnings that it is undefined, no accesses, insertions, or  deletions
have  been  observed,  and  so  on.  Each such worry is weighted, and
periodically the system resolves such  warning  notes  --  especially
near the completion of the target program.
	The  new  BEINGs  created
during  the  execution of a specified task are kept distinct from the
original set  of  BEINGs. A ⊗4program⊗*  for  task  ⊗4T⊗*  is
accomplished  by doing ⊗4T⊗* and then dumping the new BEINGs out onto
a new file.  The entire old  BEINGs  pool  is  then  treated  as  the
language  supporting this file. As a corollary, asking a BEINGs-based
system to write any subset of itself is the null task.
	Most  programmers  intentionally augment their code to aid in
later debugging, modification, and readability by humans.  These aids
are  typically  comments,  summaries,  over-generalized  subroutines,
optional print statements,  and  runtime  statistics.  Recently, the
structure  of  the  program  itself has also been
recognized as a powerful
manipulable element, affecting the accessibility of the code, not just
its length or running time.
Any program written as a pool of BEINGs is several times as long as a
clean hand-coded version.  This extra code, the superfluous organs of
each new BEING, may be viewed as organized self-documentation.  In
theory it should improve the debugging, expansion, and accessibility
(to alien users) of the synthesized code.
	Some assertions are so meaningful to the user that
they are stored in the code itself, even if they are
only temporary.  At any time, the user
may look at a piece of code; the comments present  ⊗4then⊗* are  all
assertions pertaining to that piece of code at that time.
BEINGS may use comments at  several  different  levels.  At  the
lowest  level,  each  comment  is  merely  a  unique token; it may be
inserted, removed, or searched  for.   Slightly  better,  the  BEINGs
system can ask "is there a comment involving ...". For this purpose, a
constrained syntax for the comments should be developed. Otherwise
(as, unfortunately, in PUP6) each new comment must be attended to
ad hoc.
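	Such a constrained comment syntax might be sketched as follows (illustrative only; PUP6 itself lacked this, and the names are invented):

```python
# Illustrative sketch of a constrained comment syntax: a comment is a
# (keyword, text) assertion attached to a code location, so it can be
# inserted, removed, or searched for by keyword rather than ad hoc.

comments = {}   # code location -> list of (keyword, text) assertions

def add_comment(loc, keyword, text):
    comments.setdefault(loc, []).append((keyword, text))

def remove_comment(loc, keyword):
    comments[loc] = [(k, t) for (k, t) in comments.get(loc, [])
                     if k != keyword]

def involving(keyword):
    """Answer the query 'is there a comment involving ...?'"""
    return [(loc, t) for loc, cs in comments.items()
            for (k, t) in cs if k == keyword]

add_comment("main-loop", "INFINITE-LOOP", "no exit test written yet")
```

A query like `involving("INFINITE-LOOP")` then retrieves every assertion bearing that keyword, wherever it sits in the code.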

.B

.E
.ONCE CENTER
⊗5↓_8. Ideal and real systems_↓⊗*

	When imagining an ideal AP  (automatic  programming)  system,
one  ability  we  might  wish  for  is  that  it  be  able to input a
well-documented program, glean pure programming knowledge from it,
and  transform  this  into  a  format  it  can use.  Also, to improve
itself, it should be capable of digesting a sample, valid AP  dialog,
and (perhaps through additional user interaction) of modifying itself
so that it could actually carry on that dialog.
Another ideal to hope for is that the user be permitted to do whatever
he can whenever he can; if he suddenly thinks of a stray caution, the
AP system should accept it (e.g., "Oh, I might want to type in 
HALT instead of STOP, sometimes").
BEINGs were not designed with
these goals in mind, and as a result they may be an
insufficient framework for achieving them.
By studying the difficulties of PUP6,
isolating those due to poor programming from those due to
poor ideas, enough may be learned to build the next system one
qualitative step closer to this ideal.
It is in this spirit that BEINGs
are now contrasted against  the  concepts  of  ACTORs, CLASSes,
FRAMES, HACKER, formal AP systems, and macro expansion schemes.
	ACTORS [5]  provided  the  key  concept  of  integrating
uniformity  of  construction  with  sophistication of behavior.
Each ACTOR module has its own, unique organs, hence must be (implicitly)
aware of all the organs (message formats)
of all the  ACTORs it ever is going to shake hands with. 
Adding a new module is thus conceptually intricate as
well as practically difficult. CLASSes have a few standard
organs, and the modules are arranged in groups, each of which has its
own additional types of organs. Thus a module can ask ⊗4any⊗* other module
one of a few universal questions, and it can ask any module in its group
an additional set of questions. If it is permitted to know about other
groups' special organs, then the same adding problem recurs. If it is
denied such knowledge, it cannot access much of the knowledge in the
pool.  Yet if one requires a completely universal set of message types, then
most of them will be irrelevant to most modules. This is the price which
BEINGs pay.  This superfluity is real and almost
intolerable.
	FRAMES [6] seem superficially similar to BEINGs, and
are so amorphous that they actually could subsume BEINGs. There is a
deep difference of philosophy and of intended usage, however.
Each FRAME intentionally has default values already filled in (with no
processing), which are replaced only when necessary.
This  is  akin  to  automatic  programming by blind debugging, but is
antithetical to the fastidious bookkeeping BEINGs philosophy.  As  an
example,   consider   the   writing   of  a  short,  recursive
LISP program (reverse, flatten, factorial, etc.).
A human, consulting the proper frame in his head, knows to ignore the
base step for a while, then fill it in, usually  as  ⊗4if  NIL,  then
NIL⊗*.  The
human makes this default synthesis, tries out the program, and  looks
closer  (to fill in this "slot" properly) only if something calls his
attention to it (such as a LISP error message:
NON-NUMERIC ARG  ⊗4NIL⊗*, e.g., if what was really needed was the base
step ⊗4if NIL, then 0⊗*).
A BEINGs system would
also  defer working on the base step, but could -- and would -- place
a note about that deferral into the assertional warning base.  Before
thinking  it  was  finished,  the  system  would  attend to that note
carefully. This is not a criticism:
FRAMES are meant to model perception; BEINGs are not.
	The contrast with HACKER is similar to that with FRAMES.
The orientation of HACKER is to put together pieces of programs
into something close to the final desired target, then try to run
it.  When  errors result, they are analyzed and then patched.  PUP6
is oriented toward writing bug-free code; what it writes must  always
coincide  with its internal specifications of that code. The user may
later change his mind, and the system should be (and occasionally
is) able to modify its
own  programs,  but  this  process  is much better defined than blind
debugging. On the other  hand,  if  PUP6 does  have  an
unexpected bug in it, it may be fatal. One
way to overcome this would be to include special recover-from-bugs
BEINGs.
	The formal manipulation systems which also synthesize code
are so different from BEINGs that it is difficult to compare them.
These  logical systems (e.g., Luckham's) attack an entirely different
problem.  Their task is specified rigorously  in  advance,  and  they
proceed  formally  to  search  for  a  solution  program.  PUP6 is
designed to work from a much more
ambiguous specification, heuristically.  Each  action
is  implicitly justified by the fact that the user can later describe
to the system
what is undesirable about the program it has generated. PUP6  should
thus  be  tolerant of user ambiguity and error.
	As with ⊗4any⊗* AP system which writes structured programs, the
external behavior of PUP6 is reminiscent of macro expansion,
regardless of ⊗4what⊗* the internal
BEING  interactions  are.  One  major  distinction  between  the  two
concepts is when and how the macros are expanded. Traditional answers
are:  at  compile  time,  by  rigid   substitutions.
BEINGs  systems expand their "macros" at
times they choose, using knowledge gleaned from each other  and  from
the  user. For example, consider a series of macro calls occurring in
the code: (m1)(m2)(m3). A macro expander could expand these in any order,
and the three activities could go on simultaneously, with no interference
or change in the final code. BEINGs would try to find some task common
to all three calls, and work on that first. The order of full
expansion would be carefully weighed, 
and during that expansion new
information would be learned which could affect the expansions of the
other two function calls. The final code would not simply be the APPEND
of the expansions of m1, m2, m3, as it would with a macro scheme.
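	The contrast can be made concrete with a small sketch (invented names; neither function is PUP6 code): a classical expander substitutes each call independently, while a BEINGs-style expander first performs a subtask common to all the calls and lets each expansion see what earlier ones learned.

```python
def macro_expand(calls, table):
    # rigid compile-time substitution: each call expands independently,
    # so the result is just the APPEND of the separate expansions
    return [table[c] for c in calls]

def beings_expand(calls, subtasks, expand):
    # find a task common to all the calls and work on it first;
    # every later expansion sees everything learned so far
    common = set.intersection(*(subtasks[c] for c in calls))
    learned = {}
    for t in sorted(common):
        learned[t] = expand(t, learned)
    return [expand(c, learned) for c in calls]

# Usage: three calls sharing one subtask.
table = {"m1": "code1", "m2": "code2", "m3": "code3"}
subtasks = {"m1": {"choose-representation"},
            "m2": {"choose-representation"},
            "m3": {"choose-representation"}}
trace = []
def expand(task, learned):
    trace.append(task)
    return (task, sorted(learned))
result = beings_expand(["m1", "m2", "m3"], subtasks, expand)
```

The trace begins with the shared subtask, and each of m1, m2, m3 is expanded with knowledge of that earlier work, which a rigid substitution scheme cannot provide.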


.B

.E
.ONCE CENTER
⊗5↓_9. Questions for AP systems_↓⊗*

	The success of an AI effort
can be assessed only where accepted standards of intelligence exist.
The  character of the dialogue (described in the next section)
is, of course, one important measure of the  intelligence
of  an  AP (automatic programming)
system.   One which asked "What is the first instruction?
What is the second...?" would be universal but worse than useless.
This  section  proposes some   questions  one  should  ask  when
confronted by a new AP system 
which generates code heuristically from a dialogue.
Capsule answers pertaining to PUP6 are parenthesized after each question;
fuller answers are located elsewhere.
	The questions break into two categories. First, one wants  to
know  what the system does.  Most important, does it synthesize a program?
(yes.)
If  so,  how  complex  is  that  task,  and  what  is   the   program
specification  like? (a concept formation program, several pages long,
from one specific protocol dialogue).  
How precise must the user be:    may he err (no),
forget (no), change his mind? (in limited cases.) 
How detailed must his  dialogue  be? (by construction, just about as
detailed as if he were talking to a CS grad student programmer.) How
closely  does  the  dialogue determine the details of the final code?
(much of the 
final code is clearly determined by the precise user responses.)
How does the user know where he is during the dialogue? (simulated cursors
pointing to a graph representing synthesized code.)
Does the user ever get lost? (Often.)
	Quite  seriously, an important question is whether the system
writes a second program. (yes.)  
If so, how does it relate to the first  one
synthesized? (both are inductive inference LISP programs.) 
How much of the AP system is actually used in writing
⊗4both⊗* programs? (49 BEINGs out of 79.) 
One should ask what the full range of programs is
which  the system has successfully synthesized. (the concept former, the
grammatical inferrer, and the simple property list manipulation system.)
The dual questions to
these inquire whether a program may be generated by several different
(in the sense of rephrasing and in the sense of reordering the subtasks)
dialogues (theoretically, but not practically),  
and  what  the  range  of variability is. (theoretically large, but
practically quite limited.) Similarly, what
about the target language: is it a parameter? (no, always the same.) 
How does it  relate  to
the language the system is written in? (both written as BEINGs in 
INTERLISP.)
	Does the system modify as well as write code? (yes, but not
as its major mechanism.)   If so,  can
it  modify  anybody's,  or only those programs it has written itself?
(only its own, and only in simple ways.)
What is its "style" of programming? (many supplementary comments,
fairly clean structured programs written as BEINGs.)
One also wishes to know the  size
of  the tool as well as the size of the job: How does the system size
compare to target program size? (one order of magnitude larger.)
	Efficiency considerations are often the crucial
pragmatic ones. Economics and current hardware restrict the amount of
computation  which  may be done in toto; the expected lifetime of the
universe restricts us from using the  most "elegant" algorithms;  and
human  patience  restricts the interaction delay time (to a minute or
two of real time). One must therefore  ask  things  like
how  much  time  and  space  it  requires,  how long a delay there is
between user inputs, etc. (one CPU hour, 100K,  ≤1 CPU minute 
between responses, compiled INTERLISP run on a PDP-10 TENEX system.)
	The  second  class of questions concerns the theoretical side
of the AP system.  What new ideas is it meant to experiment with? 
(using BEINGs to write large programs).
How
do these ideas fare? (mixed results; this is the subject of
most of the remainder of this paper).
Does it write its code in the right way?
(it asks the right questions of  the  user  at  just  the  proper
times with respect to the original protocol.)
How extendable is it: how difficult is it to add new pieces
of knowledge and to modify old ones? (theoretically easy, practically it
requires familiarity with the system.)  Could  it  write  programs  in
various fields (possibly), in various styles (unlikely), in various sizes?
(the three target programs differ by less than one order of magnitude.)
	How is knowledge  represented  internally? (BEINGs, assertions,
demons.)   Is  any  of  it
duplicated? (not much above the LISP language level; some universal
knowledge is factored out; e.g., how to CHOOSE-FROM a set of BEINGs.)
If so, why? (too primitive to factor it out; e.g.,
how to bind variables.) 


.B

.E
.ONCE CENTER
⊗5↓_10. Examples from the dialogue which results in CF_↓⊗*

The dialogue to synthesize the concept formation
program (CF) begins by PUP6 asking the
user  for  any  task.   The user replies, ⊗4Write a program which does
concept formation⊗*. There are many decisions that PUP6 knows about in
writing  a  specialized  concept formation program [4], 
and it was given just enough concept formation knowledge to
defer each.   For example, it must choose between classificatory,
comparative,  and metrical concept formation. Since all three involve
partitioning a domain of examples, PUP6 decides  it  can  defer  this
choice  until  after code for the partitioning is synthesized.    The
effort of the system is now  directed  toward  this  "piece"  of  the
program.  When it is completed, an hour or two later, the system asks
the user to make this decision. When he responds  ⊗4CLASSIFICATORY⊗*,
which  involves ⊗4only⊗* partitioning,  PUP6  prints that it has
finished the entire CF task, and dumps the newly created BEINGs  onto
a file.

	An example of the PUP6 community of BEINGs interacting will now
be presented.
When a BEING is said to do or recognize or know something, as
in the following paragraphs, what is actually meant  is
that  one or more of its organs specifically encode that knowledge.  All
the organs of the hundred BEINGs in PUP6 were coded by  hand,  not  by
each  other.  They then  encoded other BEINGs, interacting only via
the dialogue.
	Let's jump one third of the way into the dialogue which results in
creating  CF. The system learns it must compare the input scene against
its internally stored models of concepts, until it  finds  one  which
passes the  comparison.   It  asks the user what it means for
scene S to fail the comparison to model M. The user  replies,
whenever M  is  incompatible  with  the  observed  features  of  S.
The IDEN organ of the
CONTRADICTS BEING recognizes this sentence pattern, and transforms it
to 

(FORSOME  F  IN M-FEATURES   (⊗4specialized-contradicts⊗*   F   S-FEATURES)).

The BEING
which inserts this into the synthesized code requires that  the  user
pick  some  proper  name  for  the new BEING ⊗4specialized-contradicts⊗*;
this will be a streamlined contradiction test predicate written by the
CONTRADICTS BEING.  Say the user responds by calling it IMPOSS. This naming
and specializing is central to BEING creation: a BEING recognizes an
instance of itself, and decides either to insert a call to itself or
else to insert a call to a specialized version of itself.  If any
nontrivial decisions must be made, and can be settled at synthesis
time, then the latter alternative is chosen.  CONTRADICTS is too 
general a BEING to be called in an inner loop, so its content will
be hammered out at synthesis time. On the other hand,
FORSOME is so primitive that normally no specialized
version of it is written.
	Here is the way this piece of the dialogue actually appears.
The user's responses are italicized.
.BEGIN NOFILL FLUSH LEFT

PUP: Please type in a logical expression which is true iff we terminate the loop
USER: ⊗4Any feature in model-features is incompatible with scene-features⊗*
PUP: PUP wants USER to type in name for specialized version of CONTRADICTS
USER: ⊗4IMPOSS⊗*
.END
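	The recognition step of the IDEN organ can be suggested by a toy pattern-matcher (a single regular expression standing in for the hand-coded INTERLISP patterns; everything here is illustrative only):

```python
import re

# Toy stand-in for the IDEN organ of CONTRADICTS: recognize one
# quasi-English phrasing and translate it into a BEING call.

PATTERN = re.compile(
    r"any feature in (\S+) is incompatible with (\S+)", re.IGNORECASE)

def iden_contradicts(sentence, specialized_name):
    m = PATTERN.search(sentence)
    if m is None:
        return None               # this BEING does not recognize it
    m_feats, s_feats = m.groups()
    return f"(FORSOME F IN {m_feats} ({specialized_name} F {s_feats}))"

call = iden_contradicts(
    "Any feature in model-features is incompatible with scene-features",
    "IMPOSS")
```

In PUP6 every BEING carried such recognizers, so a sentence not matched by one BEING could still be claimed by another.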
	Later in the  dialogue,  PUP6  decides  it  must
expand the function IMPOSS. The CONTRADICTS BEING is again called on; this
time  it  is  asked  how  to  write  a  specialized  version   of   a
contradiction  test.  Its SPECIALIZATIONS organ
replies that ⊗4some of⊗* the following types of
contradiction  may  occur:        PROBABILITY=0,  PROBABILITY=1,  and
PROBABILITY⊗4ε⊗*(0,1).    This   is   the  germ  of  the  idea  for  the
MUSTNOT/MUST/MAY categorization of features. The SOME-OF BEING  takes
control,  and  asks  if  the  decision can be deferred.
Eventually
the
DEFERRAL BEING reports failure.  SOME-OF sees it has  no  basis  upon
which  to  form  a  guess,  and  wants somebody to ask the user.  The
ASK-USER BEING takes
over.
The names of the three choices are so cryptic that
their WHAT and HOW organs are printed out to  the  user,  as  well  as
their names.  The user types back his choices, in our case all three.
SOME-OF regains control and  composes  
three  new  BEING  calls,   separated   by   a
conditional:
.BEGIN GROUP NOFILL SELECT 3 INDENT 0

(COND  
 ((IS-OF-TYPE  F  PROBABILITY=0-PART-OF-M-FEATURES) (PROBABILITY=0-CONTRADICTION  F  S-FEATURES))
 ((IS-OF-TYPE  F  PROBABILITY=1-PART-OF-M-FEATURES) (PROBABILITY=1-CONTRADICTION  F  S-FEATURES))
 ((IS-OF-TYPE  F  PROBABILITY⊗4ε⊗*(0,1)-PART-OF-M-FEATURES) (PROBABILITY⊗4ε⊗*(0,1)-CONTRADICTION  F  S-FEATURES)))
.END
TRICHOTOMY recognizes this as a split on a function's  values
being  =0,  =1,  or  between  zero  and  one.   It  asks whether this
particular  function  can  only  range  over  the   interval   [0,1].
PROBABILITY  answers  affirmatively,  so  SOME-OF  replaces the final
IS-OF-TYPE test by the constant T.
That is, any feature not passing one of the first two IS-OF-TYPE tests ⊗4must⊗*
pass the final one, so the final one is unnecessary.
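	The TRICHOTOMY simplification can be sketched as follows (an illustrative fragment, not PUP6 code): once the three branch tests are known to exhaust the function's range, the last test is replaced by the constant T.

```python
# Illustrative sketch of the TRICHOTOMY simplification: if the branch
# tests exhaust the function's range, the final test is redundant and
# becomes the constant "T" (always true).

def simplify_exhaustive_cond(branches, exhaustive):
    """branches: list of (test, action) pairs, as in a LISP COND."""
    if exhaustive and branches:
        test, action = branches[-1]
        branches = branches[:-1] + [("T", action)]
    return branches

branches = [
    ("(IS-OF-TYPE F P=0-PART)", "(P=0-CONTRADICTION F S)"),
    ("(IS-OF-TYPE F P=1-PART)", "(P=1-CONTRADICTION F S)"),
    ("(IS-OF-TYPE F P-IN-(0,1)-PART)", "(P-IN-(0,1)-CONTRADICTION F S)")]
# PROBABILITY answered that its range is [0,1], so the split is exhaustive:
simplified = simplify_exhaustive_cond(branches, exhaustive=True)
```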
	The three contradiction-checking predicates
know (their ENCODABLE  organs  tell  them)  that
they  are  immediately encodable. A probability=0 contradiction means
that arg1 must not occur in the  world  arg2  yet  it  does.  So  the
META-CODE  of  the  PROBABILITY=0-CONTRADICTION  BEING  is defined as
(MEMBER arg1 arg2).  This corresponds to a MUSTNOT feature F which is
present  in  the  world  (in  
the  set  S-FEATURES, features of the scene). A
probability=1 contradiction occurs when a feature arg1 must occur  in
the  world  arg2,  yet  it  doesn't. So its definition is simply (NOT
(MEMBER arg1 arg2)).  It is impossible for a feature with probability
in  (0,1)  to  be  in  contradiction  with any world  lacking negated
features, so this third predicate is defined as  the  constant  NIL.
That  is,  if F is a MAY feature of the model M, then there can be no
contradiction between F and ⊗4any⊗* features of the scene S.
	The IS-OF-TYPE BEING recognizes that it must work hard to
fill in organs for each of the two case tests, unless M-FEATURES is organized
permanently  into  three  parts (thus permitting the complex calls
"IS-OF-TYPE" to be replaced by the simple ones "MEMBER" in  the  above
piece of code). The STRUCTURE-INDUCING BEING takes over, probing
the permissibility and the difficulty of such a constraint.  It finds
nothing  about  this  tripartite  structuring  which could damage any
earlier synthesized code, and asks the  user's  blessing.  Demons are
created, stating that any reference to
the entire list of M-FEATURES must be replaced by  either  APPEND  of
the  three  new lists, or else by three separate statements. GET-NAME
is indirectly called, and it has the user name the three new sets  of
features; say he responds by calling them MUSTNOT, MUST, and MAY. The ENCODE
BEING says to draw an arrow on its graph of newly
created BEINGs, going
from the BEING call (IMPOSS F S-FEATURES) to the new block of code:
.BEGIN GROUP NOFILL INDENT 0 SELECT 3

  (COND  ((MEMBER  F  MUSTNOT-PART-OF-M)  (MEMBER  F  S-FEATURES))
	     ((MEMBER  F  MUST-PART-OF-M)     (NOT (MEMBER  F  S-FEATURES)))
	     ( T  (COMMENT THIS "T" REPLACES "MEMBER F MAY-PART-OF-M")  NIL))
.END
	This is now the META-CODE organ of the new BEING called IMPOSS.
Most of its other organs are cloned from its generalization, namely
CONTRADICTS. During the course of writing this piece, however, some
of these organs should be slightly changed. For example, IMPOSS's reason
for existing was strengthened when the simple MEMBER function
calls replaced the slow IS-OF-TYPE  BEING calls.
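	The META-CODE of IMPOSS transcribes directly; the following sketch (in Python, purely for illustration) computes the same predicate, with the set names following the dialogue's MUSTNOT/MUST/MAY naming:

```python
def imposs(f, s_features, mustnot_part_of_m, must_part_of_m):
    """True iff feature F of model M contradicts the scene's features
    (a MAY feature is whatever lies in neither of the first two sets)."""
    if f in mustnot_part_of_m:
        return f in s_features        # a forbidden feature is present
    if f in must_part_of_m:
        return f not in s_features    # a required feature is absent
    return False                      # a MAY feature never contradicts

# The ARCH example of section 11: (SUPPORTS A B) is a MUST relation of
# ARCH but is absent from the scene, so the scene fails the comparison.
scene = [("BLOCK", "A"), ("GREEN", "B"), ("TOUCHES", "A", "B")]
failed = imposs(("SUPPORTS", "A", "B"), scene,
                mustnot_part_of_m=[],
                must_part_of_m=[("SUPPORTS", "A", "B")])
```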
	Most of this processing is inter-BEING activity; the user is
not needed for -- or even informed of -- most of it.
Alas, "the user" is  not  generic;  there
was  only one successful user.


.B

.E
.ONCE CENTER
⊗5↓_11. Excerpt from the synthesized program itself running_↓⊗*

The PUP6 system interacts with the user, resulting in the creation of
CF, the desired target program.  Like PUP6, CF is nothing but a pool of
BEINGs. So, as it runs, the user can interrupt it and ask it questions
about what it's doing. The answers are typically just some organs of the
BEING currently in control.
Any BEING on the control stack will provide fully instantiated
answers to any of its 29 allowable queries; all other BEINGs will provide only
hypothetical answers.
An excerpt of CF running without interruption
is given below, followed by the same piece with interruptions. 
The characters of the dialogues with PUP6 to write CF, and with CF to
learn concepts, are similar; an "interrupted" excerpt of the user-PUP6
dialogue will be omitted.
Some liberty has been taken with the
English (e.g., ⊗4WHAT?⊗*, not ⊗4What are you doing?⊗*).
CF types in boldface, the user in italics.
.BEGIN NOFILL FLUSH LEFT SELECT 2

READY TO ACCEPT BRAND NEW SCENE:   ⊗4(??? (A B) (Block A) (Green B) (Touches A B))⊗*
NOT AN ARCH. NOT A TOWER. MIGHT BE A ROW.  MY GUESS IS ROW.
WHAT IS IT REALLY?		   ⊗4Nothing⊗*
TRANSFER FROM MAY TO MUST SETS OF ROW: (TOUCH A B)
READY TO ACCEPT BRAND NEW SCENE:
.END

Here is the ⊗4same⊗* excerpt, but with the user interrupting occasionally.
"⊗4U:⊗*"
signifies he has interrupted,
and "⊗4OK.⊗*" signifies he has told CF "⊗4OK, go back to work now.⊗*"

.BEGIN NOFILL FLUSH LEFT SELECT 2

	⊗4U:  What are you doing?⊗*
CF:  Divide a domain into subdomains, using specialized version of PARTITION-DOMAIN
  	⊗4How?⊗*
CF:  By building up the partition gradually, by adding one <scene, name> pair.  ⊗4OK.⊗*

READY TO ACCEPT BRAND NEW SCENE:   ⊗4(??? (A B) (Block A) (Green B) (Touches A B))⊗*

	⊗4U:  What are you doing?⊗*
CF:  Break NAME piece off (??? (A B) (BLOCK A) (GREEN B) (TOUCHES A B)) to examine.
	⊗4What code is the current BEING going to execute?⊗*
CF:  The meta-code of SOME-PART-OF↓1↓4 is (CAR arg1). Evals to ???. ⊗4OK.⊗*

NOT AN ARCH.
	⊗4U:  Why?⊗*
CF: The ARCH MUST-relation (SUPPORTS A B) is contradicted in the SCENE. ⊗4OK.⊗*

NOT A TOWER. MIGHT BE A ROW.  MY GUESS IS ROW.
WHAT IS IT REALLY?		   ⊗4U:  Why are you asking?⊗*
CF: CF cannot resolve it, and must at this time, so CF asks USER
	⊗4OK. What you wanted is:   Nothing⊗*

	⊗4U:  What are you doing?⊗*
CF: Go through space of concept names, and make user aware of changes.
	⊗4What BEINGs might this affect?⊗*
CF: MESSAGE is possibly called; some version of TEST is possibly called.  ⊗4OK.⊗*

TRANSFER FROM MAY TO MUST SETS OF ROW: (TOUCH A B)
READY TO ACCEPT BRAND NEW SCENE:

.END
.B

.E
.ONCE CENTER
⊗5↓_12. Other Tasks_↓⊗*

	Let  us  briefly  describe  GI  and  PL, the second and
third programs PUP6 synthesized.   GI  is  a  simple
grammatical inference program. It builds up a set of plausible  rules,
rather  than  enumerating  through  the space of all grammars. Yet it
lacks most of the "tricks" of  the  best  g.i.  programs.   The  user
repeatedly specifies a string, labelled LEGAL, ILLEGAL, or ??. In the
last case, GI must print out its guess and accept the correct label
from  the  user.   In  all  three  cases,  it  must update its set of
plausible rules to be at  least  consistent  with  all  positive  and
negative  instances  observed. In some versions of GI which the user coaxes
PUP6 to generate, a parse tree may be maintained  and  inspected;  in
other versions, just a list of the rules used during a parse is kept.
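	To make "at least consistent with all positive and negative instances observed" concrete, here is a toy stand-in far simpler than GI itself (a "rule" here licenses one adjacent symbol pair; the whole scheme, including its handling of conflicts, is invented for illustration):

```python
# Toy stand-in for GI: a rule licenses one adjacent symbol pair, and a
# string is guessed LEGAL iff every pair it contains is licensed.  The
# rule set is kept consistent with all labelled strings seen so far.

def pairs(s):
    return {s[i:i+2] for i in range(len(s) - 1)}

class ToyGI:
    def __init__(self):
        self.plausible = set()
        self.forbidden = set()

    def observe(self, s, label):
        if label == "LEGAL":
            self.plausible |= pairs(s) - self.forbidden
        else:                          # ILLEGAL
            self.forbidden |= pairs(s) - self.plausible
            self.plausible -= self.forbidden

    def guess(self, s):
        return "LEGAL" if pairs(s) <= self.plausible else "ILLEGAL"

gi = ToyGI()
gi.observe("ab", "LEGAL")
gi.observe("ba", "ILLEGAL")
gi.observe("abb", "LEGAL")
```

Every observed positive and negative instance constrains the plausible set, just as each labelled string constrains GI's rules.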
	PL is a data-retrieval-on-keys system, which maintains a
property list structure.
The user repeatedly types
in a request to insert, delete, or inspect a  record  (e.g.,  INSERT:
PASSENGER  Jones  FLIGHT  741  FROM SFO TO LAX CARRIER TWA LEAVE 7:15
ARRIVE 8:00).  Any unspecified fields are treated  as  don't-cares;
thus the request is matched against entries in the data base.
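	PL's retrieval rule can be sketched as follows (illustrative Python, with invented field layout; the target was of course a BEINGs program): a request is a partial record, unspecified fields are don't-cares, and a stored record matches iff it agrees on every field the request does specify.

```python
def matches(request, record):
    # a record matches iff it agrees on every field the request specifies;
    # fields absent from the request are don't-cares
    return all(record.get(k) == v for k, v in request.items())

db = [
    {"PASSENGER": "Jones", "FLIGHT": 741, "FROM": "SFO", "TO": "LAX",
     "CARRIER": "TWA", "LEAVE": "7:15", "ARRIVE": "8:00"},
    {"PASSENGER": "Smith", "FLIGHT": 9, "FROM": "SFO", "TO": "JFK"},
]

def inspect(request):
    return [r for r in db if matches(request, r)]

def delete(request):
    """One pattern purges every matching reservation, not just the
    first (the modification the user requests later in this section)."""
    db[:] = [r for r in db if not matches(request, r)]
```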
.BEGIN GROUP
	The table below shows how each type of knowledge was used  in
writing the three target programs.  All numbers refer to BEINGs.

.NOFILL INDENT 6 SELECT 7


.ONCE CENTER
        ⊗5U  S  E  D     I  N     S  Y  N  T  H  E  S  I  Z  I  N  G⊗*         

TYPE OF          CF    CF    CF    CF     GI    PL     not     Crea    Crea    Crea    Total 	 Total
KNOWLEDGE        GI    GI    PL  only   only   only    used    -ted    -ted    -ted    BEINGs	 BEINGs
                 PL  only   only                       ever    in CF   in GI   in PL		 in PUP6

Programming      39     7     5     5     3      1       14       52      27      21      174	    74
Domain-Specific   0     3     0     9     8      1        5        4      10       3       43	    26
Total BEINGs     39    10     5    14    11      2       19       56      37      24      217	   100
.END

"Created" means "written by PUP6 during the dialogue which synthesized".
Notice the percentage of programming BEINGs  which  were used in  all
three dialogues (39/74). The three domain-specific BEINGs used
in both CF and GI sessions indicate that they may be specific to the
domain of inductive inference.  They deal with partitioning a domain,
and aren't used in PL, which is not an
inductive inference task.

	Although BEINGs can theoretically handle
user errors, PUP6 was set up expecting that the user would never  err
or  forget.  He could, after the program was finished, try it out and
describe how he wished it changed. Occasionally, PUP6 actually makes
the right modifications. For example, 
after the synthesis of PL is finished, the user tries
out the program and finds that he is unable to delete more  than  one
reservation  with  any  single  command. He complains about this, and
PUP6 modifies the program so that he may specify a pattern,  and  all
reservations  matching  that  pattern  will  be  purged.   
There is only one sentence, however, which is able to effect this change; 
the slightest deviation would cause it to be misunderstood!

.B

.E
.ONCE CENTER
⊗5↓_13. Numerical efficiency data_↓⊗*

	Good measures  of  concentration  of intelligence are not yet
available.  The only alternative is to present  ephemeral  numerical
efficiency  data,  which now follows. PUP6 is 200 pages long when
PRETTY-PRINTED, and has synthesized three programs, 7, 20, and 30
pages long (1, 4, and 6 pages, when coded by hand.)
	Two important factors are omitted when simply comparing system
and target lengths. First, one might improve any such measure by simply
decreasing the size of a given system. This is wrong, since, e.g., 
removing all the comments from a program shouldn't increase its
intelligence rating!  Secondly, the total volume of distinct target programs
should be considered. Compilers turn out programs small compared to
themselves, but are valuable because of the vast variety of such programs
they can put together.
This factor reflects how much of the "insides" of the system can be used in
writing many programs.
	PUP6  takes  60  cpu minutes to produce CF (compiled INTERLISP
code, run on a PDP-10 TENEX system). During this time,
300K characters get typed out to the user, who  responds  with  about
4K  himself. The dialogue lengths are best specified in characters,
since often the user will supply simply a number or  a  letter  or  a
YES/NO  reply.
Even  the  character  counts  are
very  rough,  because  the  format  of  the AP dialogue can easily be
varied by a couple of orders of magnitude.  The mean wait time  (between
the user's input and the next machine output) is several seconds. The
longest delay  between  successive  user  inputs  is  1  CPU  minute.
Unfortunately,  PUP6  requires 100K to run in, which (with INTERLISP)
exhausts the virtual memory of the current hardware being  used.  One
would lose vast amounts of time by using intermediate storage devices
or by running consecutive communicating jobs. Efficient use of multiple
processors and of extended core storage might be made.
	PL  was  one fifth the size of CF, and took about a seventh
as long to generate. The dialogue was shorter by only 
a factor of three. The
dialogue for the CF target program was too long and tiresome; the problem
was even more acute for the PL program.

.B

.E
.ONCE CENTER
⊗5↓_14. Conclusions_↓⊗*

	The  main  successes  of the experiment were that the desired
reasoning steps in the original protocol
were actually simulated, most of the BEINGs were used
in  writing  more  than one of the programs, and the synthesized code
could be interrupted and asked a few kinds of  questions.   The  main
defects  were  the  inflexibility of the system to new dialogues, the
inability for  the  user  to  add  new,  high-level,  domain-specific
knowledge,  the verbosity of the system, and its ill-founded dependence on
user reliability.  These problems seem to
be inherent in the task of nonformal automatic synthesis of large programs.
The small fraction of organs which were actually relevant to each BEING (a
third) indicates that this scheme should be used as one mechanism in 
some AI systems, but not as the sole curator of information and control.
	The  fact  that target code is in the format of BEINGs limits
the size of the target programs (≥ one page) and their efficiency  (a
BEING-call  is a very slow and intricate process) and of course their
style. The most startling result -- which should have been foreseen --
was that the synthesized code has all the properties of BEINGs.
This was mentioned
casually earlier, 
but is worth restating: the generated code
is  self-commenting  in the strong sense that it
can answer any
(of our set of 29) questions about itself.  Those which make sense at
compile-time  can  be  asked  then;  those  which  make  sense during
execution may be asked then.
	The set of questions the user was expected to want to ask the
system  is  similar  to the questions one BEING wants to ask another.
So when the "nice" user interrupts, his questions are almost always a
simple  retrieval  from a property list (a GETP or a composition like
EVAL o GETP). When the average  user  interrupts,  his  questions  are
often unintelligible to PUP6.
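	Why the "nice" interruptions are cheap can be sketched as follows (illustrative Python; the organs shown are invented, but the two retrieval modes mirror GETP and EVAL o GETP):

```python
# Each BEING's organs sit on a property list, so most answers are a
# GETP (plain retrieval) or an EVAL-of-GETP (retrieval, then
# evaluation in the current context).

beings = {
    "PARTITION-DOMAIN": {
        "WHAT": "Divide a domain into subdomains",
        "HOW": lambda: "By adding one <scene, name> pair at a time",
    }
}

def getp(being, organ):
    return beings[being].get(organ)

def answer(being, organ):
    value = getp(being, organ)
    return value() if callable(value) else value   # EVAL o GETP
```

A question outside this fixed repertoire of organ retrievals is exactly the kind the average user asks, and exactly the kind PUP6 cannot answer.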
	Meaningful use of comments proved helpful. As an example, the
comment at the end of the main outer loop was "COMMENT, INFINITE LOOP
IN THIS PROG" for most of the CF-producing session. Near the  end
of  the  dialogue, the  CLARIFY-IMPROBABLE-SITUATION BEING recognizes
this as a warning, works on introducing a breakaway  test,  and  then
replaces  the  comment by one indicating that no infinite loop exists
there anymore.   The  advantage  of  embedding  our
insertions in the code in this way is that, at any stage, the user can
inspect the code and get something meaningful  out  of  the  comments
which are present.
	The  careful  bookkeeping   actually   did   eliminate   some
carelessness  errors, though it had no effect on user errors or later
program maintenance  directives.  This   last    process     is  less
ill-defined  than  blind debugging, and in fact is  similar to
programming itself.  The deferral  of  decisions  has  the
corollary of reducing (though not minimizing) communication with
the slow user channel.
	Synthesizing a new, 
clean target program  probably  would  require
adding  a  few domain-independent BEINGs and several domain-specific
BEINGs, and generalizing a few organs of  some  existing  BEINGs.
Hopefully, no new organ would need to be added to BEING anatomy.
Since  the  dialogues  for  GI  and PL were not studied beforehand,
their acceptability  would  have  demonstrated  the  success  of  the
system.  Most of the existing BEINGs were used in generating these
new programs, but a few new BEINGs had to be added. This addition
process required some detailed knowledge of the PUP6 system, not just of
the domain.  Equally
discouraging, the dialogue to write PL is too long and detailed
for the task at hand.  It also  seems  that  the  front  end  is  too
brittle  to  allow  anyone  unfamiliar with the system to easily work
with it.
	The need for flexible communication stems  only
partially from inter-user  differences.   A  serious  and  unexpected
source  of  conflict  here  is the amount of inference the user wants
done.  This level is relatively subjective, and may fluctuate rapidly
even  with  one  user  writing one program. Perhaps there should be a
dial he can set. At one extreme, the system would  ask  the  user  to
verify  even  the  most  trivial  inductions.  At  the  other extreme
setting, the system would probably write  the  target  program  after
just a few brief user inputs. The user would then try out one program
after another, interjecting just one or two comments each time, after
which  the  system would come up with the next version.  This program
modification mode seems appropriate for someone ignorant of LISP, who
nevertheless has the I/O behavior of his target clearly in mind.
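The proposed dial might work as follows. This Python sketch is purely illustrative: the thresholding scheme, the 0-to-10 scale, and every name in it are assumptions, not anything PUP6 implements.

```python
# Hypothetical sketch of the inference "dial": the same induction step
# either queries the user or proceeds silently, depending on the setting.

VERIFY_ALL, NEVER_ASK = 0, 10          # the two extreme dial settings

def induce(guess, confidence, dial, ask_user):
    """Accept the system's induction silently only when its confidence
    clears a threshold set by the dial; otherwise put it to the user."""
    threshold = 1.0 - dial / 10.0      # dial 0 -> verify even trivia
    if confidence >= threshold:
        return guess                   # proceed without bothering the user
    return ask_user(guess)             # defer even this induction to him
```

At the VERIFY_ALL extreme every uncertain induction goes to the user; at NEVER_ASK the system simply proceeds, which corresponds to the program-modification mode described above.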
	Some of the BEING organs  proved  embarrassingly  unnecessary.
The  CO-REQUISITES  organ was never used.  The only activity which had
to be done concurrently was demon activation. The AFFECTS organ was of
interest  to  the  user  occasionally,  but  never to any BEING.  The
EFFECTS organ originally had a twin, describing what would  happen  if
each  BEING  were  ⊗4not⊗*  called right now.  In each case, this was
merely a trivial restatement of the frame problem.  So,  like  STRIPS,
PUP6  ignores  the  frame  problem:     all changes must be
explicit.
	Many of PUP6's comments are boring and unnecessary; this was
felt to be an engineering problem which could be ignored in all but a
"marketable"  AP system. This view was probably incorrect. One of the
most severe AP problems is having  the  system  inform  the  user  of
precisely what he should know -- no more and no less.  This requires a
sophisticated  user  model  cleverly  interfaced   to   the   current
programming task.
	Another problem with large, long dialogues is user  error.  A
human  has  great  difficulty keeping "on top" of everything.  He may
get lost, forget, change his  mind,  or  misunderstand.  Again,  this
problem  was originally considered ignorable, but has shown itself to
be a limiting practical factor in wide accessibility.  It was
necessary  to  create  a  forgetful-user  demon  to prevent anaphoric
reference to something which, though uniquely determined, hadn't been
mentioned within the past several seconds of real time.
Related to this is the problem of keeping
the user informed of where he is. PUP6 simulated a continuous display
of  the graph of the current partial program, augmented by static and
dynamic cursors. This  simple 
use  of  dynamic  and  textual  indices  seems
insufficient.  The user was still often confused, wishing to refer
to a section not by pointing, nor by name,
but rather by a brief phrase meaningful (to him
only!).  This may also be one of the major AP problems which
must be studied further.
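The forgetful-user demon described above amounts to a recency window on referents. A hypothetical Python sketch (the class, the window size, and the referent names are all invented for illustration):

```python
import time

class ForgetfulUserDemon:
    """Hypothetical sketch: an anaphoric reference resolves only if its
    referent was mentioned within the last few seconds of real time,
    even when the referent is uniquely determined."""

    def __init__(self, window_seconds=30.0):
        self.window = window_seconds
        self.last_mention = {}          # referent -> time of last mention

    def mention(self, referent, now=None):
        """Record that the dialogue just mentioned this referent."""
        self.last_mention[referent] = time.time() if now is None else now

    def resolvable(self, referent, now=None):
        """May 'it' / 'that test' still refer to this referent?"""
        now = time.time() if now is None else now
        then = self.last_mention.get(referent)
        return then is not None and now - then <= self.window
```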
	Will we ever need a
system whose target programs will be longer and more complex than the
system  itself?
Why couldn't we settle for a truly intelligent
AP system several times as large as anything it
could produce?
Worries  about large dialogues arise because future AP
systems are viewed as tools for writing programs which humans  simply
cannot  manage  currently:      viable natural language systems, huge
operating systems, better AP systems.
Is there any evidence that such hyper-productive systems could ever
be written? There seems
to be a manageable body of information which one needs to master
in order to program.  One can
write  programs  as complex as he wishes if the program is designed in a
clean way. Thus the size of AP systems will probably
level  off  (albeit  huge
compared  to  existing  programs)  even as the size and complexity of
their generated code continues to increase and,  eventually,  surpass
them.

.B

.E
.ONCE CENTER
⊗5↓_References_↓⊗*

.SPACING 35 MILLS
.PREFACE 110 MILLS

[1] Balzer, Robert, ⊗4Automatic Programming⊗*, Institute Technical
Memorandum,  University of Southern California
Information Sciences Institute, September, 1972.

[2] Gadwa, Peter, ⊗4SPOT, a mini concept formation program⊗*,
Master's Thesis, Artificial Intelligence Laboratory,
Stanford University, Stanford, California, August, 1973.

[3] Green, Waldinger, Barstow, Elschlager, Lenat, McCune, Shaw, and
Steinberg,
⊗4Progress Report on Program-Understanding Systems⊗*, Memo AIM-240,
CS Report STAN-CS-74-444, Artificial Intelligence Laboratory,
Stanford University, August, 1974.

[4] Hempel, Carl G., ⊗4Fundamentals of Concept Formation in
Empirical Science⊗*, University of Chicago, Chicago, Illinois,
1952.

[5] Hewitt, Carl, ⊗4A Universal Modular ACTOR Formalism for
Artificial Intelligence⊗*, 3rd International Joint Conference on
Artificial Intelligence,
1973, pp. 235-245.

[6] Minsky, Marvin, ⊗4Frames⊗*, in ⊗4Psychology of Computer
Vision⊗*, 1974.

[7] Minsky, Marvin, and Papert, Seymour, ⊗4Artificial
Intelligence Progress Report⊗*, MIT Project MAC, AI Memo
252, 1972.

[8] Newell, Allen, and Simon, Herbert,
⊗4Human Problem Solving⊗*, Prentice-Hall, 1972.

[9] Teitelman, Warren, ⊗4INTERLISP Reference
Manual⊗*, XEROX PARC, 1974.

[10] Winston, Patrick, ⊗4Learning Structural Descriptions
from Examples⊗*, Ph.D. thesis, Dept. of Electrical Engineering,
MIT; Project MAC TR-76, AI Lab TR-231,
September, 1970.


The ideas and the system described use concepts from ACTORS, 
heterarchy, structured programming, assertional data bases,
flexible data types, pattern-directed invocation of procedural
knowledge, the paradigm of dialogue, studies on program specification,
QLISP, INTERLISP, LISP, English,... In particular, the author  
drew heavily from creative discussions with C. Green, R. Waldinger,
D. Barstow, B. McCune, D. Shaw, E. Sacerdoti, and L. Steinberg.
Computer time for the research was generously provided by the
Artificial Intelligence Center of SRI.
.SKIP 2
.ONCE CENTER
⊗5↓_Appendix: the BEINGs_↓⊗*

.PREFACE 300 MILLS

Space does not permit detailed discussion of each BEING, but the
names are sufficiently suggestive to give the reader a rough 
characterization of the scope of the knowledge in the PUP6 system.

.SELECT 3 
ADAPT:PRECONCEIVED:FUNCTION
ADD:DEFINITION
ADJECTIVE:HANDLER
ANALYZE:IMPLICATIONS
APPLYRULE
ASK:USER:ABOUT
BETTER
CHOOSE:FROM
CLARIFY:IMPROBABLE:SITUATION
CLASSIFICATORY:CONCEPT:FORMATION
COMPARE
COMPARITIVE:CONCEPT:FORMATION
COMPLEX:ALTERATION
COMPLEX:COMPARE:FN
COMPLEX:MODIFY:STRUCTURE
CONCEPT:FORMATION
CONDITIONAL:DELETION
CONDITIONAL:INSERTION
CONSTRAIN
CONTRADICTS
DATA:STRUCTURE:DELETIONS
DATA:STRUCTURE:INSERTIONS
DEFER:DECISION
ELEMENT
ENCODE
EXAMINE:STRUCTURE
EXTRACT:RELEVANT:SUBSET
FAST:GET:NAME
FAST:SATISFY
FILL:IN:UNDEFINED:SECTION
FIX:INCORRECT:PIECE
FOREACH
GET:DATA:STRUCTURE
GET:HOLD:OF
GET:NAME
GET:NEW:INFORMATION
GRAMMATICAL:INFERENCE
INFER:CONTEXTFREE:GRAMMARS
INFER:CONTEXTSENSITIVE:GRAMMARS
INFER:FIXEDCLASS:GRAMMARS
INFER:MULTICLASS:GRAMMARS
INFER:PHRASESTRUCTURE:GRAMMARS
INFER:REGULAR:GRAMMARS
IS:OF:TYPE
JOINING:FUNCTION
LIST:STRUCTURE
MAJOR:MODIFY:STRUCTURE
MAKE:A:GUESS
MAKE:ENCODABLE
MAKE:NEW:BEING
MESSAGE
METRICAL:CONCEPT:FORMATION
MODIFY:SOME
MODIFY:STRUCTURE
MODIFY:UNTIL
OBTAIN:USABLE:INFORMATION
OPTIMIZE
PARSE
PARSE:BACKWARD
PARSE:FORWARD
PARTITION:A:DOMAIN
PARTITION:BY:TAKE:CLASS:GET:ELE
PARTITION:BY:TAKE:ELE:AND:CLASS
PARTITION:BY:TAKE:ELE:GET:CLASS
PATTERN:MATCH
PROBABILITY=0:#
PROBABILITY=1:#
PROBABILITY>0&<1:#
PROPOSE:PLAUSIBLE:NAMES
RECOGNIZE:ARGS
RECOGNIZE:CAR
RECOGNIZE:CONDITIONAL
RECOGNIZE:CONJUNCTION
RECOGNIZE:EQUALITY
RECOGNIZE:FUNCTION:RETURNS
RECOGNIZE:INCLUSION
RECOGNIZE:LITERALS
RECOGNIZE:NUMBER
RECOGNIZE:SET:RELATIONS
RECOGNIZE:SOME:MEMBER
RECOGNIZE:TAIL
REINVESTIGATE:DECISION
REPEATEDLY
RESOLVE:DECISION
SATISFY
SCENE
SEARCH
SERVE
SIMPLE:COMPARE:FN
SOME:PART:OF
STRING
STRUCTURE-INDUCE
STUDY:TYPE
SUPPORT&DUMP
TAKE:HOLD:OF
TEST
TRANSLATE
TRICHOTOMY
USE:INFORMATION
UTILIZE
WHEN:NEXT
WRITE:PROGRAM

.B

.E
.ONCE CENTER